Dimensionality Reduction with Unsupervised Nearest Neighbors
Abstract
The growing information infrastructure in a variety of disciplines creates an increasing need for efficient data mining techniques. Fast dimensionality reduction methods are important for the understanding and processing of large sets of high-dimensional patterns. In this work, unsupervised nearest neighbors (UNN), an efficient iterative method for dimensionality reduction, is presented. Starting with an introduction to machine learning and dimensionality reduction, the framework of unsupervised regression is introduced, which is the basis of UNN. Algorithmic variants are developed step by step, ranging from a simple iterative strategy in discrete latent spaces to stochastic kernel-based submanifolds with independent parameterizations. Experimental comparisons with related methods on real-world data sets and in missing-data scenarios illustrate the behavior of UNN in practice.
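The iterative strategy in discrete latent spaces mentioned above can be illustrated with a minimal NumPy sketch. The greedy insertion scheme below is an assumption-laden simplification of the UNN idea, not the authors' implementation: each pattern is placed, one at a time, at the position in a growing one-dimensional latent ordering that minimizes its reconstruction error from its latent-space neighbors.

```python
import numpy as np

def unn_embed_1d(X, k=2):
    """Greedy UNN-style sketch (illustrative, not the paper's algorithm):
    insert each pattern into a growing 1-D latent ordering at the position
    minimizing its reconstruction error, where a pattern is reconstructed
    as the mean of its latent-space neighbors' high-dimensional patterns."""
    order = [0]  # latent ordering of pattern indices, seeded with pattern 0
    for i in range(1, len(X)):
        best_pos, best_err = 0, np.inf
        for pos in range(len(order) + 1):
            # Candidate ordering with pattern i inserted at position pos
            cand = order[:pos] + [i] + order[pos:]
            j = pos  # position of i in the candidate ordering
            # Up to k latent neighbors on each side of i
            nbrs = cand[max(0, j - k):j] + cand[j + 1:j + 1 + k]
            err = np.linalg.norm(X[i] - X[nbrs].mean(axis=0))
            if err < best_err:
                best_pos, best_err = pos, err
        order.insert(best_pos, i)
    return order
```

On toy data with two well-separated groups, the greedy insertion tends to place similar patterns at adjacent latent positions, which is the qualitative behavior the full method optimizes.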
Similar Resources
Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data
Different aspects of the curse of dimensionality are known to present serious challenges to various machine-learning methods and tasks. This paper explores a new aspect of the dimensionality curse, referred to as hubness, that affects the distribution of k-occurrences: the number of times a point appears among the k nearest neighbors of other points in a data set. Through theoretical and empiri...
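The k-occurrence statistic described above is straightforward to compute directly. The following NumPy sketch (dataset sizes and dimensions are illustrative assumptions) counts, for each point, how often it appears among the k nearest neighbors of the other points:

```python
import numpy as np

def k_occurrences(X, k):
    """Count how often each point appears among the k nearest
    neighbors of the other points in the data set."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    # Indices of each point's k nearest neighbors
    nn = np.argsort(d2, axis=1)[:, :k]
    # k-occurrences: how often each index shows up in a neighbor list
    return np.bincount(nn.ravel(), minlength=n)

rng = np.random.default_rng(0)
low = k_occurrences(rng.standard_normal((500, 3)), k=5)
high = k_occurrences(rng.standard_normal((500, 100)), k=5)
# In high dimensions, the k-occurrence distribution tends to become
# skewed: a few "hub" points attain very large counts.
```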
On Evolutionary Approaches to Unsupervised Nearest Neighbor Regression
The detection of structures in high-dimensional data plays an important role in machine learning. Recently, we proposed a fast iterative strategy for non-linear dimensionality reduction based on the unsupervised formulation of K-nearest neighbor regression. As the unsupervised nearest neighbor (UNN) optimization problem does not allow the computation of derivatives, the employment of dire...
Unsupervised K-Nearest Neighbor Regression
In many scientific disciplines, structures in high-dimensional data have to be found, e.g., in stellar spectra, in genome data, or in face recognition tasks. In this work we present a novel approach to non-linear dimensionality reduction. It is based on fitting K-nearest neighbor regression to the unsupervised regression framework for learning low-dimensional manifolds. Similar to related appr...
Nonlinear Dimensionality Reduction using Approximate Nearest Neighbors
Nonlinear dimensionality reduction methods often rely on the nearest-neighbors graph to extract low-dimensional embeddings that reliably capture the underlying structure of high-dimensional data. Research, however, has shown that computing the nearest neighbors of a point in a high-dimensional data set generally requires time proportional to the size of the data set itself, rendering the computation...
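One simple way to approximate the nearest-neighbors graph is to search for neighbors in a cheap random projection of the data rather than the original space. The sketch below is a conceptual illustration of this idea under assumed parameters (projection dimension, data sizes); it is not the paper's method, nor a sublinear-time index such as a tree or LSH structure:

```python
import numpy as np

def ann_graph(X, k, proj_dim=8, seed=0):
    """Approximate k-NN graph via a Gaussian random projection:
    neighbors are searched in a low-dimensional sketch of the data
    instead of the original high-dimensional space."""
    rng = np.random.default_rng(seed)
    # Random projection matrix, scaled to roughly preserve distances
    P = rng.standard_normal((X.shape[1], proj_dim)) / np.sqrt(proj_dim)
    Y = X @ P  # low-dimensional sketch of the data
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]
```

Distance computations then cost O(proj_dim) instead of O(d) per pair, at the price of occasionally returning a neighbor that is only approximately nearest in the original space.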
On Linear Embeddings and Unsupervised Feature Learning
The ability to train deep architectures has led to many developments in parametric, non-linear dimensionality reduction, but little attention has been given to algorithms based on convolutional feature extraction without backpropagation training. This paper aims to fill this gap in the context of supervised Mahalanobis metric learning. Modifying two existing approaches to model latent space similari...